Datasets

This chapter briefly reviews datasets for music demixing by summarizing the previous tutorial, which gives a more detailed introduction and explanation.

Data for Music Demixing

At a high level, the inputs and outputs of a source separation model look like this:

../_images/source_separation_io.png

Fig. 1 Inputs and outputs of a source separation model.

MUSDB18: tutorial

The MUSDB18 dataset [RLStoter+17] is one of the most widely used datasets for music demixing. For example, its uncompressed version (also known as MUSDB18-HQ [RLS+19]) was the official training dataset for Leaderboard A of the MDX challenge.

This section shows how to play with the musdb package.

First, install the musdb package.

pip install musdb

After installation, load musdb with download=True. This downloads 7-second sample excerpts of the MUSDB18 tracks.

import musdb
mus = musdb.DB(download=True)
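If a full copy of the dataset is already on disk, musdb.DB can be pointed at it instead of downloading the excerpts. The sketch below only assembles the keyword arguments and does not touch the dataset. The helper name musdb_kwargs and the path are hypothetical; root, is_wav, subsets, and split are, to our knowledge, parameters of musdb.DB, with is_wav=True intended for the uncompressed MUSDB18-HQ files.

```python
def musdb_kwargs(root, hq=False, subset="train", split=None):
    """Assemble keyword arguments for musdb.DB (hypothetical helper)."""
    # Assumption: `split` is only meaningful for the training subset.
    if split is not None and subset != "train":
        raise ValueError("split is only meaningful for the 'train' subset")
    kwargs = {"root": root, "is_wav": hq, "subsets": subset}
    if split is not None:
        kwargs["split"] = split
    return kwargs


# Hypothetical local path to an MUSDB18-HQ copy.
kwargs = musdb_kwargs("path/to/musdb18hq", hq=True, split="valid")
print(kwargs)

# With the dataset on disk, the arguments would be used as:
#   import musdb
#   mus_valid = musdb.DB(**kwargs)
```

Passing split="train" or split="valid" divides the training subset into training and validation tracks, which is useful when a held-out set is needed for model selection.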

We can use mus like a list of tracks: it supports len(), indexing, and iteration.

print(len(mus))
144

Let us load the first track of the MUSDB18 dataset:

track = mus[0]
print(track)
A Classic Education - NightOwl

Let us listen to the mixture (i.e., the input audio in Fig. 1):

from IPython.display import Audio, display

display(Audio(track.audio.T, rate=track.rate))
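The transpose in the call above is needed because musdb stores audio as an array of shape (n_samples, n_channels), whereas IPython's Audio expects (n_channels, n_samples). The following minimal sketch illustrates this with a silent stand-in array rather than a real track; the 44100 Hz rate and the 7-second length are assumptions that mirror the sample excerpts.

```python
import numpy as np

rate = 44100  # assumed sample rate in Hz
# Stand-in for track.audio: 7 seconds of stereo silence, samples first.
audio = np.zeros((7 * rate, 2), dtype=np.float32)

n_samples, n_channels = audio.shape
duration = n_samples / rate  # length in seconds

# Channels-first view, the layout passed to Audio(...) above.
channels_first = audio.T

print(f"samples-first shape:  {audio.shape}")
print(f"channels-first shape: {channels_first.shape}")
print(f"duration: {duration} s")
```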

Let us listen to the individual sources (i.e., the target outputs):

# Play each isolated source (stem) of the track
for name, source in track.sources.items():
    print(f'source name: {name}')
    display(Audio(source.audio.T, rate=track.rate))
source name: vocals
source name: drums
source name: bass
source name: other

Thus, the input and output of the music demixing task on MUSDB18 are:

  • input: track.audio

  • output: {source: track.sources[source].audio for source in ['vocals', 'drums', 'bass', 'other']}
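To make the input/output relationship concrete, here is a small sketch that uses synthetic noise in place of real stems. It assumes a purely linear mix (the mixture is the sample-wise sum of the four sources), which is an idealization; in real recordings, especially those stored with lossy compression, the mixture may only approximately equal the sum of its stems. All array contents are made up for illustration.

```python
import numpy as np

rate = 44100          # assumed sample rate in Hz
n_samples = 2 * rate  # two seconds of synthetic audio
rng = np.random.default_rng(0)

source_names = ["vocals", "drums", "bass", "other"]

# Synthetic stand-ins for track.sources[name].audio, shape (n_samples, 2).
sources = {
    name: 0.1 * rng.standard_normal((n_samples, 2))
    for name in source_names
}

# Under the linear-mix assumption, the model input is the sum of the targets.
mixture = np.sum(np.stack([sources[name] for name in source_names]), axis=0)

# A demixing model maps `mixture` back to one estimate per source name.
residual = mixture - sum(sources[name] for name in source_names)
print("mixture shape:", mixture.shape)
print("max |mixture - sum of sources|:", np.abs(residual).max())
```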

Quick overview of existing datasets

In the MDX challenge, systems submitted to Leaderboard A had to be trained only on the training set of MUSDB18-HQ (or MUSDB18). Leaderboard B placed no constraints on the choice of training data, so participants could use any available dataset.

Here is a quick overview of existing music demixing datasets released since 2015:

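| Dataset | Year | Instrument categories | Tracks | Average duration (s) | Full songs | Stereo |
|---|---|---|---|---|---|---|
| DSD100 | 2015 | 4 | 100 | 251 \(\pm\) 60 | | |
| MUSDB18 | 2017 | 4 | 150 | 236 \(\pm\) 95 | | |
| MUSDB18-HQ | 2019 | 4 | 150 | 236 \(\pm\) 95 | | |
| Slakh2100 | 2019 | 34 | 2100 | 249 | | |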

You can check the full list of datasets here. This extended table is based on SigSep/datasets and is reproduced with permission.
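The track counts and average durations in the table give a rough sense of how much audio each dataset contains. The sketch below simply multiplies the two columns; the results are approximations, since the averages are rounded and the standard deviations are ignored.

```python
# (tracks, average duration in seconds), copied from the table above.
datasets = {
    "DSD100": (100, 251),
    "MUSDB18": (150, 236),
    "MUSDB18-HQ": (150, 236),
    "Slakh2100": (2100, 249),
}

# Approximate total audio per dataset, in hours.
total_hours = {
    name: tracks * avg_seconds / 3600
    for name, (tracks, avg_seconds) in datasets.items()
}

for name, hours in total_hours.items():
    print(f"{name}: about {hours:.1f} hours")
```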